8 resultados para proteome

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the post genomic era with the massive production of biological data the understanding of factors affecting protein stability is one of the most important and challenging tasks for highlighting the role of mutations in relation to human maladies. The problem is at the basis of what is referred to as molecular medicine with the underlying idea that pathologies can be detailed at a molecular level. To this purpose scientific efforts focus on characterising mutations that hamper protein functions and by these affect biological processes at the basis of cell physiology. New techniques have been developed with the aim of detailing single nucleotide polymorphisms (SNPs) at large in all the human chromosomes and by this information in specific databases are exponentially increasing. Eventually mutations that can be found at the DNA level, when occurring in transcribed regions may then lead to mutated proteins and this can be a serious medical problem, largely affecting the phenotype. Bioinformatics tools are urgently needed to cope with the flood of genomic data stored in database and in order to analyse the role of SNPs at the protein level. In principle several experimental and theoretical observations are suggesting that protein stability in the solvent-protein space is responsible of the correct protein functioning. Then mutations that are found disease related during DNA analysis are often assumed to perturb protein stability as well. However so far no extensive analysis at the proteome level has investigated whether this is the case. Also computationally methods have been developed to infer whether a mutation is disease related and independently whether it affects protein stability. Therefore whether the perturbation of protein stability is related to what it is routinely referred to as a disease is still a big question mark. In this work we have tried for the first time to explore the relation among mutations at the protein level and their relevance to diseases with a large-scale computational study of the data from different databases. To this aim in the first part of the thesis for each mutation type we have derived two probabilistic indices (for 141 out of 150 possible SNPs): the perturbing index (Pp), which indicates the probability that a given mutation effects protein stability considering all the “in vitro” thermodynamic data available and the disease index (Pd), which indicates the probability of a mutation to be disease related, given all the mutations that have been clinically associated so far. We find with a robust statistics that the two indexes correlate with the exception of all the mutations that are somatic cancer related. By this each mutation of the 150 can be coded by two values that allow a direct comparison with data base information. Furthermore we also implement computational methods that starting from the protein structure is suited to predict the effect of a mutation on protein stability and find that overpasses a set of other predictors performing the same task. The predictor is based on support vector machines and takes as input protein tertiary structures. We show that the predicted data well correlate with the data from the databases. All our efforts therefore add to the SNP annotation process and more importantly found the relationship among protein stability perturbation and the human variome leading to the diseasome.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Urine is considered an ideal source of biomarkers, however in veterinary medicine a complete study on the urine proteome is still lacking. The present work aimed to apply proteomic techniques to the separation of the urine proteome in dogs, cats, horses, cows and some non-conventional species. High resolution electrophoresis (HRE) was also validated for the quantification of albuminuria in dogs and cats. In healthy cats, applying SDS-PAGE and 2DE coupled to mass spectrometry (MS), was produced a reference map of the urine proteome. Moreover, 13 differentially represented urine proteins were linked with CKD, suggesting uromodulin, cauxin, CFAD, Apo-H, RBP and CYSM as candidate biomarkers to be investigated further. In dogs, applying SDS-PAGE coupled to MS, was highlighted a specific pattern in healthy animals showing important differences in patients affected by leishmaniasis. In particular, uromodulin could be a putative biomarker of tubular damage while arginine esterase and low MW proteins needs to be investigated further. In cows, applying SDS-PAGE, were highlighted different patterns between heifers and cows showing some interesting changes during pregnancy. In particular, putative alpha-fetoprotein and b-PAP needs to be further investigated. In horses, applying SDS-PAGE, was produced a reference profile characterized by 13±4 protein bands and the most represented one was the putative uromodulin. Proteinuric horses showed the decrease of the putative uromodulin band and the appearance of 2 to 4 protein bands at higher MW and a greater variability in the range of MW between 49 and 17 kDa. In felids and giraffes was quantified proteinuria reporting the first data for UTP and UPC. Moreover, by means of SDS-PAGE, were highlighted species-specific electrophoretic patterns in big felids and giraffes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The continuous increase of genome sequencing projects produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publically available. However the sole sequencing process of a genome is able to determine just raw nucleotide sequences. This is only the first step of the genome annotation process that will deal with the issue of assigning biological information to each sequence. The annotation process is done at each different level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished only by in vitro analysis procedures resulting extremely expensive and time consuming when applied at a this large scale level. Thus, in silico methods need to be used to accomplish the task. The aim of this work was the implementation of predictive computational methods to allow a fast, reliable, and automated annotation of genomes and proteins starting from aminoacidic sequences. The first part of the work was focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo, and was developed in 2006. The main peculiarity of the method is to be independent from biases present in the training dataset, which causes the over‐prediction of the most represented examples in all the other available predictors developed so far. This important result was achieved by a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm with the creation of the so called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells and three, kingdom‐specific, predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo reported to outperform all the currently available state‐of‐the‐art methods for this prediction task. BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it in a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each aminoacidic sequence extracted from the genome, the predicted subcellular localization merged with experimental and similarity‐based annotations. In the second part of the work a new, machine learning based, method was implemented for the prediction of GPI‐anchored proteins. Basically the method is able to efficiently predict from the raw aminoacidic sequence both the presence of the GPI‐anchor (by means of an SVM), and the position in the sequence of the post‐translational modification event, the so called ω‐site (by means of an Hidden Markov Model (HMM)). The method is called GPIPE and reported to greatly enhance the prediction performances of GPI‐anchored proteins over all the previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI‐anchored proteins by maintaining a rate of false positive prediction as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI‐anchored proteins were predicted, 561 of which are found in H. sapiens. In average 1% of a proteome is predicted as GPI‐anchored. A statistical analysis was performed onto the composition of the regions surrounding the ω‐site that allowed the definition of specific aminoacidic abundances in the different considered regions. Furthermore the hypothesis that compositional biases are present among the four major eukaryotic kingdoms, proposed in literature, was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A systematic characterization of the composition and structure of the bacterial cell-surface proteome and its complexes can provide an invaluable tool for its comprehensive understanding. The knowledge of protein complexes composition and structure could offer new, more effective targets for a more specific and consequently effective immune response against a complex instead of a single protein. Large-scale protein-protein interaction screens are the first step towards the identification of complexes and their attribution to specific pathways. Currently, several methods exist for identifying protein interactions and protein microarrays provide the most appealing alternative to existing techniques for a high throughput screening of protein-protein interactions in vitro under reasonably straightforward conditions. In this study approximately 100 proteins of Group A Streptococcus (GAS) predicted to be secreted or surface exposed by genomic and proteomic approaches were purified in a His-tagged form and used to generate protein microarrays on nitrocellulose-coated slides. To identify protein-protein interactions each purified protein was then labeled with biotin, hybridized to the microarray and interactions were detected with Cy3-labelled streptavidin. Only reciprocal interactions, i. e. binding of the same two interactors irrespective of which of the two partners is in solid-phase or in solution, were taken as bona fide protein-protein interactions. Using this approach, we have identified 20 interactors of one of the potent toxins secreted by GAS and known as superantigens. Several of these interactors belong to the molecular chaperone or protein folding catalyst families and presumably are involved in the secretion and folding of the superantigen. In addition, a very interesting interaction was found between the superantigen and the substrate binding subunit of a well characterized ABC transporter. This finding opens a new perspective on the current understanding of how superantigens are modified by the bacterial cell in order to become major players in causing disease.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Different types of proteins exist with diverse functions that are essential for living organisms. An important class of proteins is represented by transmembrane proteins which are specifically designed to be inserted into biological membranes and devised to perform very important functions in the cell such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins largely under-represented in structure databases because of the extreme difficulty in experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried-out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was also introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. Results have shown that the model is able to correctly predict topologies of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performances of the proposed approach were measured as comparable or superior to the current state-of-the-art of TMBB topology prediction.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Host-Pathogen Interaction is a very vast field of biological sciences, indeed every year many un- known pathogens are uncovered leading to an exponential growth of this field. The present work lyes between its boundaries, touching different aspects of host-pathogen interaction: We have evaluate the permissiveness of Mesenchimal Stem cell (FM-MSC from now on) to all known human affecting herpesvirus. Our study demonstrate that FM-MSC are full permissive to HSV1, HSV2, HCMV and VZV. On the other hand HHV6, HHV7, EBV and HHV8 are susceptible, but failed to activate a lytic infection program. FM-MSC are pluripotent stem cell and have been studied intensely in last decade. FM-MSC are employed in some clinical applications. For this reason it is important to known the degree of susceptibility to transmittable pathogens. Our atten- tion has then moved to bacterial pathogens: we have performed a proteome-wide in silico analy- sis of Chlamydiaceae family, searching for putative Nuclear localization Signal (NLS). Chlamy- diaceae are a family of obligate intracellular parasites. It’s reasonably to think that its members could delivered to nucleus effector proteins via NLS sequences: if that were the case the identifi- cation of NLS carrying proteins could open the way to therapeutic approaches. Our results strengthen this hypothesis: we have identified 72 protein bearing NLS, and verified their func- tionality with in vivo assays. Finally we have conceived a molecular scissor, creating a fusion protein between HIV-1 IN protein and FokI catalytic domain (a deoxyexonuclease domain). Our aim is to obtain chimeric enzyme (trojIN) which selectively identify IN naturally occurring target (HIV LTR sites) and cleaves subsequently LTR carrying DNA (for example integrated HIV1 DNA). Our preliminary results are promising since we have identified trojIN mutated version capable to selectively recognize LTR carrying DNA in an in vitro experiments.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Recent advances in the fast growing area of therapeutic/diagnostic proteins and antibodies - novel and highly specific drugs - as well as the progress in the field of functional proteomics regarding the correlation between the aggregation of damaged proteins and (immuno) senescence or aging-related pathologies, underline the need for adequate analytical methods for the detection, separation, characterization and quantification of protein aggregates, regardless of the their origin or formation mechanism. Hollow fiber flow field-flow fractionation (HF5), the miniaturized version of FlowFFF and integral part of the Eclipse DUALTEC FFF separation system, was the focus of this research; this flow-based separation technique proved to be uniquely suited for the hydrodynamic size-based separation of proteins and protein aggregates in a very broad size and molecular weight (MW) range, often present at trace levels. HF5 has shown to be (a) highly selective in terms of protein diffusion coefficients, (b) versatile in terms of bio-compatible carrier solution choice, (c) able to preserve the biophysical properties/molecular conformation of the proteins/protein aggregates and (d) able to discriminate between different types of protein aggregates. Thanks to the miniaturization advantages and the online coupling with highly sensitive detection techniques (UV/Vis, intrinsic fluorescence and multi-angle light scattering), HF5 had very low detection/quantification limits for protein aggregates. Compared to size-exclusion chromatography (SEC), HF5 demonstrated superior selectivity and potential as orthogonal analytical method in the extended characterization assays, often required by therapeutic protein formulations. In addition, the developed HF5 methods have proven to be rapid, highly selective, sensitive and repeatable. HF5 was ideally suitable as first dimension of separation of aging-related protein aggregates from whole cell lysates (proteome pre-fractionation method) and, by HF5-(UV)-MALS online coupling, important biophysical information on the fractionated proteins and protein aggregates was gathered: size (rms radius and hydrodynamic radius), absolute MW and conformation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Il progresso tecnologico nel campo della biologia molecolare, pone la comunità scientifica di fronte all’esigenza di dare un’interpretazione all’enormità di sequenze biologiche che a mano a mano vanno a costituire le banche dati, siano esse proteine o acidi nucleici. In questo contesto la bioinformatica gioca un ruolo di primaria importanza. Un nuovo livello di possibilità conoscitive è stato introdotto con le tecnologie di Next Generation Sequencing (NGS), per mezzo delle quali è possibile ottenere interi genomi o trascrittomi in poco tempo e con bassi costi. Tra le applicazioni del NGS più rilevanti ci sono senza dubbio quelle oncologiche che prevedono la caratterizzazione genomica di tessuti tumorali e lo sviluppo di nuovi approcci diagnostici e terapeutici per il trattamento del cancro. Con l’analisi NGS è possibile individuare il set completo di variazioni che esistono nel genoma tumorale come varianti a singolo nucleotide, riarrangiamenti cromosomici, inserzioni e delezioni. Va però sottolineato che le variazioni trovate nei geni vanno in ultima battuta osservate dal punto di vista degli effetti a livello delle proteine in quanto esse sono le responsabili più dirette dei fenotipi alterati riscontrabili nella cellula tumorale. L’expertise bioinformatica va quindi collocata sia a livello dell’analisi del dato prodotto per mezzo di NGS ma anche nelle fasi successive ove è necessario effettuare l’annotazione dei geni contenuti nel genoma sequenziato e delle relative strutture proteiche che da esso sono espresse, o, come nel caso dello studio mutazionale, la valutazione dell’effetto della variazione genomica. È in questo contesto che si colloca il lavoro presentato: da un lato lo sviluppo di metodologie computazionali per l’annotazione di sequenze proteiche e dall’altro la messa a punto di una pipeline di analisi di dati prodotti con tecnologie NGS in applicazioni oncologiche avente come scopo finale quello della individuazione e caratterizzazione delle mutazioni genetiche tumorali a livello proteico.